Search CORE

5 research outputs found

Mixed-precision deep learning based on computational memory

Author: Antonakopoulos Theodore
Boybat Irem
Egger Urs
Eleftheriou Evangelos
Le Gallo Manuel
Joshi Vinay
Karunaratne Geethan
Khaddam-Aljameh Riduan
Mariani Giovanni
Nandakumar S. R.
Petropoulos Anastasios
Piveteau Christophe
Rajendran Bipin
Sebastian Abu
Publication venue: Frontiers Media SA
Publication date: 31/01/2020

Deep neural networks (DNNs) have revolutionized the field of artificial intelligence and have achieved unprecedented success in cognitive tasks such as image and speech recognition. Training of large DNNs, however, is computationally intensive, and this has motivated the search for novel computing architectures targeting this application. A computational memory unit with nanoscale resistive memory devices organized in crossbar arrays could store the synaptic weights in their conductance states and perform the expensive weighted summations in place in a non-von Neumann manner. However, updating the conductance states in a reliable manner during the weight update process is a fundamental challenge that limits the training accuracy of such an implementation. Here, we propose a mixed-precision architecture that combines a computational memory unit performing the weighted summations and imprecise conductance updates with a digital processing unit that accumulates the weight updates in high precision. A combined hardware/software training experiment of a multilayer perceptron based on the proposed architecture using a phase-change memory (PCM) array achieves 97.73% test accuracy on the task of classifying handwritten digits (based on the MNIST dataset), within 0.6% of the software baseline. The architecture is further evaluated using accurate behavioral models of PCM on a wide class of networks, namely convolutional neural networks, long short-term memory networks, and generative adversarial networks. Accuracies comparable to those of floating-point implementations are achieved without being constrained by the non-idealities associated with the PCM devices. A system-level study demonstrates a 173× improvement in energy efficiency of the architecture when used for training a multilayer perceptron compared with a dedicated fully digital 32-bit implementation.
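
The split between high-precision accumulation and imprecise device updates described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the threshold-and-pulse transfer rule, the function name `mixed_precision_step`, and the parameters `lr` and `epsilon` are assumptions made for illustration, and device noise is not modeled.

```python
def mixed_precision_step(weights, chi, grads, lr=0.1, epsilon=0.05):
    """One illustrative update step (hypothetical helper, not from the paper).

    weights : list of floats standing in for device conductance states
    chi     : list of floats, the high-precision digital accumulator
    grads   : list of floats, one gradient per weight
    epsilon : assumed weight change produced by a single programming pulse
    """
    for i, g in enumerate(grads):
        chi[i] -= lr * g                    # accumulate the desired update digitally
        pulses = int(chi[i] / epsilon)      # whole pulses, truncated toward zero
        if pulses != 0:
            weights[i] += pulses * epsilon  # idealized coarse device update
            chi[i] -= pulses * epsilon      # keep the residual for later steps
    return weights, chi


weights, chi = [0.0], [0.0]
for _ in range(2):
    mixed_precision_step(weights, chi, grads=[-0.3])
print(weights, chi)
```

In this sketch a small update stays in the accumulator until it grows to at least one pulse, so the sum of the coarse weight and the residual always equals the full-precision update.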

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Repository for Publications and Research Data

King's Research Portal

Computational Memory Design

Author: Khaddam-Aljameh Riduan
Publication venue: ETH Zurich
Publication date: 01/01/2022

Repository for Publications and Research Data

Analysis and Comparative Evaluation of Stacked-Transistor Half-Bridge Topologies Implemented with 14 nm Bulk CMOS Technology

Author: Bezerra Pedro A.M.
Brunschwiler Thomas
Khaddam-Aljameh Riduan
Kolar Johann W.
Krismer Florian
Sridhar Arvind
Toifl Thomas
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2017

Repository for Publications and Research Data

Crossref

A Multi-Memristive Unit-Cell Array With Diagonal Interconnects for In-Memory Computing

Author: BrightSky Matthew
Bruce Robert L.
Kersting Benedikt
Khaddam-Aljameh Riduan
Le Gallo Manuel
Martemucci Michele
Sebastian Abu
Publication venue: IEEE
Publication date: 28/08/2021

Memristive crossbar arrays can be used to realize matrix-vector multiplication (MVM) operations in constant time complexity by exploiting Kirchhoff's circuit laws. This is enabled by the parallel read of the entire array in a single time step. However, parallel writing is prohibitive in such arrays due to limitations on the current that could be accumulated along the wires. Hence, loading the matrix elements into such an array still incurs a significant time penalty. Another key challenge is the achievable computational precision. To overcome these challenges, we propose a unit-cell array design where each unit-cell comprises four memristive devices, each attached to a selection transistor. Moreover, the array is organized in such a way that the selection transistors can be turned on in a diagonal fashion. We experimentally demonstrated this concept by fabricating a 2 x 2 unit-cell array based on projected phase-change memory (PCM) devices in 90 nm CMOS technology. It is shown that using the diagonal connections, the write operations can be parallelized while maintaining the current limit of the back-end-of-the-line metallization. Moreover, the increase in write time due to having more devices per unit-cell is minimized through a combination of single-shot and iterative programming schemes. Finally, we present experimental results on MVM operations that demonstrate improved computational precision exceeding that of a 4-bit fixed-point implementation. ISSN: 1549-7747, 1057-7130, 1558-3791, 1558-125
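
The MVM-by-circuit-laws idea in this abstract can be sketched for an idealized crossbar: each device contributes a current by Ohm's law, and Kirchhoff's current law sums those contributions along each column. The function name `crossbar_mvm` and the row/column layout are assumptions for illustration; the four-device unit-cell and diagonal selection scheme of the paper are not modeled here.

```python
def crossbar_mvm(conductances, voltages):
    """Column currents of an idealized crossbar (hypothetical helper).

    conductances[i][j] : conductance in siemens of the device at row i, column j
    voltages[i]        : voltage in volts applied to row i
    """
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for row, v in zip(conductances, voltages):
        for j, g in enumerate(row):
            currents[j] += g * v  # Ohm's law per device, summed per column (KCL)
    return currents


G = [[1.0, 2.0],
     [3.0, 4.0]]
V = [1.0, 0.5]
print(crossbar_mvm(G, V))  # [2.5, 4.0]
```

In hardware all column currents appear in a single read step, which is the source of the constant-time claim; the nested loop here only reproduces the arithmetic.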

Repository for Publications and Research Data

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Experimental Efficiency Evaluation of Stacked Transistor Half-Bridge Topologies in 14 nm CMOS Technology

Author: Bezerra Pedro A.M.
Braendli Matthias
Brunschwiler Thomas
Francese Pier A.
Heller Ralph
Khaddam-Aljameh Riduan
Kolar Johann W.
Kossel Marcel A.
Krismer Florian
Morf Thomas
Paredes Stephan
Publication venue: MDPI AG
Publication date: 01/05/2021

Different Half-Bridge (HB) converter topologies for an Integrated Voltage Regulator (IVR), which serves a microprocessor application, were evaluated. The HB circuits were implemented with Stacked Transistors (HBSTs) in a cutting-edge 14 nm CMOS technology node in order to enable the integration on the microprocessor die. Compared to a conventional realization of the HBST, it was found that the Active Neutral-Point Clamped (ANPC) HBST topology with Independent Clamp Switches (ICSs) not only ensured balanced blocking voltages across the series-connected transistors, but also featured a more robust operation and achieved higher efficiencies at high output currents. The IVR achieved a maximum efficiency of 85.3% at an output current of 300 mA and a switching frequency of 50 MHz. At the maximum measured output current of 780 mA, the efficiency was 83.1%. The active part of the IVR (power switches, gate-drivers, and level shifters) realized a high maximum current density of 24.7 A/mm². ISSN: 2079-929

Multidisciplinary Digital Publishing Institute

Repository for Publications and Research Data